Introduction


This project was a joint effort of the David Sarnoff Research Center (Sarnoff), Princeton
University, and Robicon Systems, all of Princeton, NJ. It was a multi-disciplinary project that
consisted of three sub-projects, each concerned with a similar kind of research: the development
of artificial adaptive systems with capabilities similar to those of their biological counterparts.
Recent work on neural networks has demonstrated their potential for solving difficult problems in
simplified, controlled environments. The next stage in the development of neural networks is their
extension to the scale, complexity, and variability of real-world situations. This will not be a
simple evolution of existing neural net designs, because it requires the integration of complex
adaptive systems whose components have widely differing functions. Fortunately, biological
organisms present existing solutions to this problem, and neuroscience can now probe the relevant
structures in detail. Biological systems are highly adaptive and operate well in extremely complex
and variable environments. They accomplish this by partitioning the system into functional
sub-units in a quasi-hierarchical structure of neural network modules. In this project we have
studied a number of specific examples of this system integration strategy and have modeled their
operation for the purpose of creating new neural network architectures and control schemes.

The  three  sub-projects,  which  are  fully  described  in  Sections  I,  II,  and  III  of  this  report,  are 
introduced  below: 

I.  Self-Supervised  Learning  Within  a  System  of  Map-Like  Neural  Networks 

Many of the nuclei of the central nervous system exhibit map-like architectures, in which
neuronal response characteristics exhibit a systematic spatial variation over the nucleus. These
systems are examples of how to break large, complex problems down into smaller, simpler
sub-problems, and an understanding of them may provide insight into the construction of similarly
powerful solutions in the technological domain. The target localization system in the barn owl is a
particularly good example. The results include biophysically realistic models and computer
simulations of the auditory localization system of the barn owl. We have produced many experimental
predictions and greatly increased our understanding of how this system computes.

II.  Modeling  Adaptive  Processing  in  the  Visual  Cortex 

This project investigated the adaptive processing of motion signals by the visual cortex. In
general, this system can be described as a chain of adaptive sub-modules, each of which adjusts
the gain of selective components of its input signal. The result is a signal for which change in
various relevant stimulus dimensions is emphasized. The results include a model of differential
motion sensitivity in the cortex. Such results are useful not only for gaining insight into neural
function, but also for improving the sensitivity of artificial vision systems to specified signal
dimensions.

III. Hierarchical Architectures and Integration of Neural Networks and Knowledge-Based
Systems for Intelligent Robotic Control

The objective of this research was to study the feasibility of using robotic skill acquisition
for the intelligent control of highly redundant, anthropomorphic robotic manipulators. The control
scheme uses models of human motor skill acquisition to guide the integration of knowledge-based
systems and neural networks, and parallels the training of an athlete by a coach: the robot
learns through experience how to perfect tasks initially specified in a high-level task language.
Knowledge-based system components are used to encode neural network learning strategies, and
skill acquisition is associated with the shift from a predominantly feedback-oriented,
knowledge-based representation of control to a predominantly feedforward, network-based form.
Intelligent robotic control systems have been constructed with a hierarchical and modular
organization, using antagonistic actuation mechanisms and multi-joint motor synergies.

Section I


Self-Supervised  Learning  Within  a  System  of  Map-Like  Neural  Networks 


A.  BACKGROUND 

Most of the visual, auditory, and somatosensory nuclei of the central nervous system exhibit
map-like architectures, in which neuronal response characteristics exhibit a systematic spatial
variation over the nucleus. These map-like nuclei appear to serve as modules within hierarchical
and parallel computing systems. These systems are examples of how to break large, complex
problems down into smaller, simpler sub-problems, and an understanding of them may provide
insight into the construction of similarly powerful solutions in the technological domain. In
addition, they exhibit other properties useful for man-made computing systems, such as
self-organization, self-optimization, and fault-tolerance. The neural substrate for target
localization in the barn owl is one such system.

The barn owl can hunt in total darkness, recognizing and locating prey by hearing alone.
One component of this behavior is a very accurate head-orienting response to salient sounds (the
head must rotate as the eyes are immobile). This head saccade centers the sound-producing object
for closer visual and acoustic scrutiny, prior to aerial attack. In the laboratory, owls can be trained
to produce naturalistic head saccades to controlled sounds, and thus indicate the perceived sound
location. In this way, the barn owl has been shown to be more accurate at localizing sounds than
any other terrestrial animal studied thus far [12].

Considerable progress has been made in determining the acoustic and neural bases of the
head saccade (see Fig. I-1). The following description is greatly simplified, as its purpose is limited
to providing the context for the work reported here (see [15, 7] for recent reviews). The acoustical
properties of the owl’s head and ears lead to the encoding of stimulus azimuth and elevation by
interaural time delay (ITD) and interaural level difference (ILD), respectively [18]. Associated
with each direction in space there is a unique relationship between frequency (F), ITD, and
ILD; to determine the direction of a sound source, the system must, in effect, compute the
nonlinear mapping between the ITD and ILD spectrum of the sound and its direction.

The binaural ITD and ILD information is extracted in two steps. First, the monaural timing
and intensity information are separated by the cochlear nuclei [26]. Second, maps representing the
ITD [27] and ILD [16] spectra are produced in, respectively, nucleus laminaris (NL) and the
nucleus ventralis lemnisci lateralis pars posterior (VLVp).

Our previous modeling suggested that the merger of ITD and ILD should occur in two
stages, in order to avoid the problem of phantom targets in multi-sound environments [22]. In the
first stage, presumably the lateral shell of the central nucleus of the inferior colliculus (ICL), cells
are tuned to unique combinations of ITD, ILD, and frequency, and arranged in a three-dimensional
map. In the second stage, each of the ICL neurons projects to and excites the region of the space
map in the external nucleus of the inferior colliculus (ICX) that corresponds to the direction
associated with the ITD/ILD/F triplet to which it is tuned. Experimental work is in basic agreement
with this model [4], but many details remain to be worked out. In any case, the equivalent of an
“acoustic retina” is found in the ICX [13, 14].

In the optic tectum, projections from the ICX [10] and the retina produce a fused
visual-acoustic representation of target direction [11], and both sensory maps are in register with a
motor map of head saccade vector [3]. The visual/auditory alignment must be dynamically
adjusted while the head is growing, because the changing shape of the head alters the relationship
between ITD, ILD, frequency, and sound direction. The tectal auditory map shifts so as to stay in
alignment with the tectal visual map [6, 8, 9]. This could easily be explained by correlation-driven
synaptic plasticity acting within a one-to-many ICX-to-tectum projection, and we in fact proposed
such a model [19, 5]. However, shortly after this contract began, the Knudsen lab demonstrated
that the plasticity is upstream of the tectum, in the inferior colliculus [1, 17]. This implies that the
visual feedback must be indirect, as there are no known visual sensory inputs to the inferior
colliculus.

Figure I-1. Overview of the neural system for auditory localization in the barn owl. The grids indicate the map-like
representation of information at each processing stage or nucleus. The “blobs” indicate the pattern of neuronal
activation on the map in response to a typical stimulus. Acronyms NL, VLVp, ICL, ICX, ITD, and ILD are defined in the
text. Arrows indicate the direction of signal flow. Symbols “az” and “el” denote azimuth and elevation, respectively,
while F denotes frequency.

B.  OBJECTIVES 

The purpose of this project was to further the understanding of this system through the
development of biophysical and computational models and computer simulations. The goal was to
produce explicit, testable predictions for neuroscience. In addition, it was expected that this
research would lead to new artificial neural network designs, with applications for signal
processing, sensory fusion, and sensorimotor integration.

C.  RESULTS 

1.  Modeling  the  Intensity  System 

Instead of modeling the ICL-to-ICX projection and visual/auditory plasticity, as originally
proposed, we chose to model the intensity processing system in the VLVp and ICL. This change
was made for several reasons in response to experimental reports made after the proposal was
written. Fujita [4] reported that the ICL contained both ILD-tuned and ILD-sensitive neurons, and
that there was some convergence across frequency in the ICL. Our previous model of the ICL
incorporated only ILD-tuned cells and had no frequency convergence. Also, as mentioned above,
the Knudsen lab showed that the visual/auditory plasticity was occurring upstream of the tectum
[1, 17]. Therefore it was decided to first construct a model of the ICL that could account for these
findings before using it to model the formation of the space map in the ICX and visual/auditory
fusion. This of course requires models of the inputs to the ICL. The nature of the representation of
ITD prior to the ICL was quite well understood, but the representation of intensity and ILD
prior to the ICL was not. However, the publication of Carr’s anatomical study of the VLVp [2],
along with Manley’s physiological paper [15], gave us enough information to attempt a model and
simulation of the VLVp. Therefore, we decided to develop joint models of the intensity
processing of the VLVp and the ICL. The anatomical and physiological data to be incorporated and
explained by our models of the VLVp and ICL are summarized in Fig. I-2, below.

Figure I-2. The intensity processing system for sound elevation of the owl. On the left, the names and acronyms of
the nuclei are given. On the right, the salient response characteristics are named and illustrated with graphs of
stimulus response. The dashed curves represent the response with an increased average binaural intensity level
(ABI). The “?” indicates that the ABI dependence is not completely known. The connections between nuclei are
indicated by lines with arrows, and the inhibitory or excitatory nature of each connection is indicated by a symbol.
The quantitative, mathematical network models developed were motivated by a qualitative
model of the generation of ILD-selectivity, called the “spatial-derivative model”. This model,
illustrated in Fig. I-3 (below), was developed independently by a number of investigators,
including S. Volman, T. Takahashi, R. Adolphs, and the personnel of this contract. The
spatial-derivative model holds that a tone of a given frequency produces a wedge-like pattern of
activation in the VLVp, and that for a fixed ABI, the position of the edge varies roughly linearly with
ILD. Such a pattern of activation is suggested by the observed dorso-ventral variation of
ILD-threshold depicted in the graphs of Fig. I-2. The model predicts that a peak-like pattern of
activation is created along the medio-lateral axis of the ICL, from the wedge-like pattern in the VLVp.
The location of the peak of activation would vary roughly linearly with ILD, which is clearly
consistent with ILD-tuned responses.
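
The wedge-to-peak idea can be illustrated with a minimal numerical sketch. It is not the network model of this report: the sigmoidal wedge shape, the cell count, the ILD range, the edge width, and all function names are assumptions chosen only for illustration. The ICL pattern is taken to be the magnitude of a central-difference spatial derivative of the VLVp pattern.

```python
import math

N_CELLS = 81                      # hypothetical number of cells along each axis
ILD_MIN, ILD_MAX = -20.0, 20.0    # hypothetical ILD range (dB) spanned by the map
EDGE_WIDTH = 0.05                 # hypothetical softness of the edge (fraction of the axis)


def vlvp_wedge(ild_db):
    """Wedge-like VLVp activation: cells on one side of an edge are active.

    The edge position is assumed to vary linearly with ILD at fixed ABI.
    """
    edge = (ild_db - ILD_MIN) / (ILD_MAX - ILD_MIN)
    return [1.0 / (1.0 + math.exp((i / (N_CELLS - 1) - edge) / EDGE_WIDTH))
            for i in range(N_CELLS)]


def icl_peak(wedge):
    """Peak-like ICL activation: magnitude of the spatial derivative of the wedge.

    A central difference is used; the two end cells are set to zero.
    """
    inner = [abs(wedge[i + 1] - wedge[i - 1]) / 2.0
             for i in range(1, len(wedge) - 1)]
    return [0.0] + inner + [0.0]


def peak_position(pattern):
    """Index of the most active cell."""
    return max(range(len(pattern)), key=lambda i: pattern[i])


for ild in (10.0, -5.0):          # the two ILDs illustrated in Fig. I-3
    print(ild, peak_position(icl_peak(vlvp_wedge(ild))))
```

In this sketch the peak index moves in equal steps for equal steps in ILD, which is the behavior the spatial-derivative model attributes to ILD-tuned cells.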


Figure I-3. The spatial-derivative model of ILD-selectivity in the ICL. The
curves illustrate the hypothetical activation patterns within the VLVp and ICL
for stimuli with ILDs of 10 dB (solid) and -5 dB (grey).


The first version of the network model of VLVp/ICL was presented in December 1989 at
NIPS [23], and at the Annual Meeting of the Society for Neuroscience. The anatomy of this model
is summarized in Fig. I-4, below.


Figure I-4. First version of the VLVp/ICL model. The size of the circles or triangles denotes the
number density of the inhibitory or excitatory cells, respectively. The afferents from NA linearly
encode the stimulus intensity, and provide the same level of excitatory drive to each VLVp neuron.
The criss-cross lines indicate the reciprocal commissural inhibitory connections between the two
VLVp.


Figure I-5. Wedge-like dorsal/ventral activation patterns in the VLVp.

All VLVp neurons receive the same level of excitatory drive from NA. There is a criss-cross
pattern of commissural inhibitory connections, which are reciprocal. Cells at one dorsal-ventral
position inhibit the cells at the corresponding position of the contralateral VLVp, and vice versa.
The reciprocity is not symmetric, as the ventral->dorsal connection is stronger than the
dorsal->ventral connection because of the gradient of inhibitory cell density. The reciprocally
connected cells thus compete: whichever cells initially receive more excitation than inhibition will
drive the output of their contralateral competitor towards zero while their own output rises towards
the maximal level. Simulations showed that a wedge-like pattern of activation is produced (see
Fig. I-5), in which the dorsal-ventral position of the “edge” of activation is linearly related to the
stimulus ILD (see Fig. I-10), consistent with the spatial-derivative model. This model was called the
“competitive edge model” of the VLVp. The model also mimicked the VLVp stimulus response
data of Manley, Koppl and Konishi [16], and the ICL stimulus response data of Fujita [4].
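
The competition described above can be sketched with a toy rate model. This is not the report's biophysical model: it uses dimensionless units, a clamped-linear firing-rate function, forward-Euler integration, and invented parameter values. It also assumes one particular reading of the criss-cross pattern, namely that a ventral cell on one side is paired with a dorsal cell on the other, and that the inhibitory strength of a cell grows linearly from dorsal to ventral.

```python
N_CELLS = 40                     # cells per VLVp (hypothetical; even, so no pair is exactly tied)
DT = 0.1                         # Euler step, in units of the cell time constant
G_DORSAL, G_VENTRAL = 2.0, 6.0   # hypothetical inhibitory strength at the ends of the gradient
ACTIVE = 0.1                     # output above this level counts as "active"


def gain(i):
    """Inhibitory strength of the cell at index i (0 = dorsal, N_CELLS - 1 = ventral)."""
    return G_DORSAL + (G_VENTRAL - G_DORSAL) * i / (N_CELLS - 1)


def clamp(u):
    """Firing-rate nonlinearity: zero below threshold, saturating at 1."""
    return min(max(u, 0.0), 1.0)


def simulate(ild_db, t_end=20.0):
    """Return (left, right) activation patterns after t_end time constants."""
    drive_left = 0.6 + 0.01 * ild_db    # same drive to every left cell (assumed linear in ILD)
    drive_right = 0.6 - 0.01 * ild_db   # same drive to every right cell
    left = [0.0] * N_CELLS
    right = [0.0] * N_CELLS
    for _ in range(round(t_end / DT)):
        new_left, new_right = [], []
        for i in range(N_CELLS):
            j = N_CELLS - 1 - i         # criss-cross partner on the other side
            new_left.append(
                left[i] + DT * (-left[i] + clamp(drive_left - gain(j) * right[j])))
            new_right.append(
                right[i] + DT * (-right[i] + clamp(drive_right - gain(j) * left[j])))
        left, right = new_left, new_right
    return left, right


def active_count(pattern):
    """Number of active cells, i.e. the extent of the wedge."""
    return sum(1 for x in pattern if x > ACTIVE)


for ild in (-10.0, 0.0, 10.0):
    left, right = simulate(ild)
    print(ild, active_count(left), active_count(right))
```

Within each pair one cell wins and silences its partner, so each side shows a block of active cells at the ventral end, and the boundary of that block moves as the ILD changes. The toy model reproduces only this qualitative shift; it makes no claim about the linearity or the time course reported for the full model.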

Early in 1990 a refinement of the VLVp model was made. A second population of local
inhibitory neurons was added, which received the same inputs as the commissurally projecting
neurons, but had the opposite density gradient, being more numerous at the dorsal surface, and less
numerous at the ventral surface. These neurons acted to buffer the competitive dynamics,
reducing the time it took the edge to form and reach its final position. This refined model was presented
at the first meeting of the AMNS Workshop (July, 1990) [24], which included an in-depth review
of the computational methods.

The ICL model was also modified during 1990, in light of recent unpublished data gleaned
from discussions with experimentalists following our 1989 Society for Neuroscience presentation.
The main factor was that the output from the VLVp to the ICL was inhibitory, and not excitatory
as we had assumed in the first version of the model. This change could be made while staying
within the framework of the spatial-derivative model (described above). The architecture of this
second model is illustrated in Fig. I-6, below. In revising the ICL model, we were greatly aided by
discussions with Ralph Adolphs of CalTech, a graduate student in the Konishi Lab. Ralph was then
performing studies of the effects of injecting various neural activity modulators in the VLVp. The
second version of the VLVp/ICL model was first presented at the 1990 Annual Meeting of the
Society for Neuroscience, and it included simulations which mimicked Ralph’s neuromodulator
experiments. This work was also presented at the 2nd AMNS workshop in the summer of 1991 [28].

One of the benefits of our modeling approach was that we could easily explore the temporal
or dynamic consequences of model parameters. All model parameters had real physical units, and
we could unambiguously relate the time step of the difference equations to a specific unit of
physiological time. For example, we could measure, in milliseconds, how long it took for the
dynamic competition in the VLVp to reach an equilibrium. This is what led us to introduce the
local inhibitory cells in the VLVp (described above). Fig. I-7 shows the pattern of activation in the
model for the first ten milliseconds following stimulus onset. Such a sequence could be displayed
very rapidly on the SUN workstations we used: 100 milliseconds of network time could
be simulated and concurrently viewed in a few seconds. This enabled ideas to be tried out and
parameters tuned very quickly.
Figure I-6. Second version of the VLVp/ICL model. The stippling within the cells
indicates the degree of stimulus-driven activation (the darker the more active).

Figure I-7. VLVp/ICL activation pattern sequence. Each of the nine panels depicts the pattern of activation
along the left and right VLVp, and the sensitive and tuned cells in the ICL, as indicated in the upper-left-hand
panel. The numeric label (0, 1, 2, 10) is the number of milliseconds following onset. The patterns had stabilized
by 10 milliseconds.

Within the last year Ralph published detailed anatomical tracing studies [30] that suggest
that the inter-VLVp inhibition is feedforward, rather than feedback, as it was assumed to be in our
first (competitive-edge) model [28]. This finding contradicts earlier work by Takahashi [33],
whose HRP tracings were consistent with the competitive-edge model. In any case, we decided to
explore whether a feedforward model could match the data of Manley et al. [16] as well as the
competitive-edge model did. Our approach to this question was to train a static feedforward
neural network to match the steady-state output of the competitive-edge model, using the techniques
developed in the field of artificial neural networks. Work just completed demonstrates that the
feedforward model can be trained so that the response curves are very similar to those of the
competitive-edge model, as shown in Fig. I-8. This similarity is somewhat surprising, since the
dynamics of the two models are so intrinsically different. The competitive-edge model is a closer
match to the known anatomy in several other ways and, in our opinion, is still the best candidate,
but more anatomical experiments will have to be done to distinguish between these models. This
work was presented at the Society for Neuroscience Annual Meeting this year [32] and a
manuscript is in preparation for submission to the Journal of Neuroscience.

Figure I-8. Comparison of competitive-edge (top) and feedforward (bottom) models of the VLVp. The firing rate as a
function of ILD and ABI is plotted. The 5 plots shown are taken from neurons spanning the dorsal-ventral dimension
of the nucleus (depths of 25%, 37.5%, 50%, 62.5% and 75%), as can be surmised from the progressively shifting
ILD thresholds.

Adolphs [30] also presented evidence that the VLVp->ICL projection is bilateral, with the
ipsilateral projection weaker than the contralateral projection. Previous work by Takahashi
(unpublished, personal communication, Summer 1988) had revealed a contralateral-only projection,
and this finding was an assumption of our previous ICL model. Conceptually, the presence of a
bilateral projection does not invalidate our previous model, and in fact, bilateral projections could
be incorporated in such a way that the resulting ICL responses and the underlying computational
model (“spatial-derivative model”) would be the same. However, a bilateral projection does
provide more degrees of freedom, and we were interested in whether an entirely different
computational scheme could be implemented using it.

We were especially interested in deriving a new model of the VLVp->ICL projection for
which the dependence on the average binaural intensity (ABI) would be less in the ICL than in the
VLVp. Our previous ICL model [28] was just as dependent on ABI as the VLVp. This is because
the underlying spatial-derivative model is based on point-to-point, topographic projections, and so
the ICL must inherit the same degree of ABI dependence as the VLVp. The degree to which
neurons in the ICL, ICX, and optic tectum are independent of ABI has not been extensively studied. It
has generally been maintained that ICX and tectal cells are relatively independent of ABI [7]. This
is what one would expect, since these neurons are thought to encode sound source direction,
which is independent of ABI. However, there is also evidence from Olsen et al. [31] that tectal
cells show the same kind of dependence on ABI that VLVp cells do [16].

Figure I-9. Comparison of the “spatial-derivative” (top) and the “ABI-independent trained” (bottom) models of the
ICL. The firing rate as a function of ILD and ABI is plotted for the tuned cells. The 5 plots shown are taken from
neurons spanning the medio-lateral dimension of the nucleus (relative distances from one border of 25%, 37.5%, 50%,
62.5% and 75%), as can be surmised from the progressively shifting ILD peaks. Note the striking difference in the
ABI dependence.


Our approach was to use the training methods of artificial neural networks to derive the
connections between the VLVp and the ICL, as well as those within the ICL. We found that nearly
ABI-independent cells can be produced in the ICL, as illustrated in Fig. I-9. However, the
ipsilateral projection from the VLVp was not essential for this. Models with the full bilateral projection
were slightly more ABI-independent than those with a contralateral-only projection, but the
difference was not large.

As expected, analysis of the resulting trained network revealed that it is based on a different
computational scheme than the spatial-derivative model. Rather than being point-to-point, the
projection onto an ICL neuron comes from a wide region of the VLVp, as illustrated in Fig. I-10.
At large ILD the cell fires at its highest rate. The excitatory input from NA is at a maximum and
the inhibition from VLVp is at a minimum, since the active VLVp cells are to the right of the
connection peak. As ILD decreases so does the cell’s firing rate. The VLVp inhibition increases as the
wedge of activation extends into the range of the connection peak, and the excitation from NA
decreases. ABI independence is achieved through a balancing act between excitation and inhibition.
For a given ILD, as ABI increases, the activation pattern in the VLVp shifts such that the
inhibition to ICL increases. At the same time, the excitation from NA increases and nullifies the
increased inhibition. This “ABI-independent trained” model shows one way in which ABI
independence can be achieved. Experimental work is needed to measure the actual degree of ABI
independence, and confirm the nature of the predicted connection patterns. This work was
presented at the CNS Conference this summer, at the Society for Neuroscience Annual Meeting this
year [32], and a manuscript is in preparation for submission to the Journal of Neuroscience.


2.  Modeling  Time  Delay  Hyperacuity  in  Nucleus  Laminaris 

Figure I-10. Analysis of a sensitive cell in the “ABI-independent trained” model of the ICL. The upper-left
panel is the ABI vs. ILD (or BID) response of the cell (compare with Fig. I-8 to see the reduction in
ABI-dependence; this reduction was much more dramatic in other sensitive cells). The upper-right panel shows, for
a given ILD, the nonlinear shift of the VLVp activation pattern as a function of ABI. The lower-left panel
shows, for a given ABI, the linear shift of the same pattern as a function of ILD. The lower-right panel is the
connection strength pattern from the VLVp onto the sensitive cell in the ICL. The solid and dashed lines are
the contralateral and ipsilateral connections, respectively.

The auditory system of the barn owl contains neurons sensitive to the phase of sounds of
remarkably high frequency, up to 9 kHz. Nucleus Laminaris (NL) represents phase differences as
part of the computation of stimulus azimuth [27]. The input to NL is from both of the monaural
magnocellular nuclei (NM). NM neurons encode stimulus phase or time by firing action
potentials preferentially near a particular phase of the stimulus [26]. However, there is significant jitter
in the phase at which the action potentials occur, which is noise in the input to NL. Furthermore,
NM neurons cannot fire during every period of the sound at such high frequencies, so the number
of spikes arriving at a laminaris neuron from each side of the head varies considerably from one
sound period to the next, giving an additional source of noise. The high frequency of the stimulus
and the high level of noise in the input spike trains make the response properties of laminaris
neurons hard to explain, and cast doubt on the common picture of NL neurons as coincidence
detectors. We used simulations and semi-numerical analysis to show that the cellular and synaptic
time constants must be very fast, probably unreasonably so, in order for ordinary biophysical
mechanisms to reproduce the observed behavior.
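
Why a fixed amount of timing jitter becomes so damaging at high frequencies can be illustrated with a short sketch based on vector strength, a standard measure of phase locking. The 40 microsecond jitter, the firing probability, and the function names are assumptions for illustration only and are not values from this work. For Gaussian timing jitter of standard deviation sigma, the expected vector strength at frequency f is exp(-2 pi^2 f^2 sigma^2), so the same jitter that barely matters at 1 kHz nearly abolishes phase locking at 9 kHz.

```python
import math
import random


def vector_strength(spike_times_s, freq_hz):
    """Length of the mean resultant of spike phases: 1 = perfect locking, 0 = none."""
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    return math.hypot(c, s) / len(spike_times_s)


def jittered_spikes(freq_hz, jitter_sd_s, n_periods, p_fire, rng):
    """At most one spike per stimulus period, at a fixed phase plus Gaussian jitter.

    Periods are skipped at random, since the cell cannot fire on every cycle.
    """
    period = 1.0 / freq_hz
    return [k * period + rng.gauss(0.0, jitter_sd_s)
            for k in range(n_periods) if rng.random() < p_fire]


def predicted_vector_strength(freq_hz, jitter_sd_s):
    """Closed form for Gaussian timing jitter: exp(-2 * pi^2 * f^2 * sigma^2)."""
    return math.exp(-2.0 * (math.pi * freq_hz * jitter_sd_s) ** 2)


rng = random.Random(0)
JITTER_SD_S = 40e-6   # hypothetical timing jitter (40 microseconds)
for f in (1000.0, 9000.0):
    spikes = jittered_spikes(f, JITTER_SD_S, 20000, 0.5, rng)
    print(f, round(vector_strength(spikes, f), 3),
          round(predicted_vector_strength(f, JITTER_SD_S), 3))
```

The random skipping of periods does not change the expected vector strength; it only reduces the number of spikes available per unit time, which is the second noise source mentioned above.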

Several people have suggested that a resonance mechanism may exist in laminaris neurons
to amplify the signal. We investigated a simple neuronal resonance model that improved the
performance considerably, but the synaptic and cellular time constants still had to be very fast, and
we did not propose a specific biophysical resonance mechanism. This work was published in the
proceedings of the second AMNS Workshop [29].
There is one peculiar feature of NL that may explain its ability to deal with high frequencies.
In the presence of a sound, there is an extra-cellular potential in NL that oscillates in phase with
the sound. This is called the neurophonic potential. Its exact amplitude has not been measured,
but it may be in the range of 1 to 10 mV [Ted Sullivan, personal communication]. The most likely
sources of the neurophonic are the NM axons, which are carrying phase-locked spikes whose
external fields would add coherently. This signal has relatively little noise simply because it is an
average over thousands of the noisy signals from individual NM axons. We have calculated how
a passive model of an NL neuron with the experimentally observed cell morphology would
respond to such an oscillating external potential. In essence, the cell acts like an electrode. The
membrane at the cell body conducts very well at frequencies above 5 kHz, and the myelinated
axon’s membrane does not. The oscillating potential near the soma propagates through the soma’s
membrane and down the axon. As a result, the oscillating part of the potential difference across
the membrane is quite small at the soma, but grows in magnitude to a maximum of significant
size at some distance down the axon. Voltage-dependent channels respond to the transmembrane
potential difference, so if their kinetics are fast enough for these frequencies they can respond to
its oscillating component. This would be a much cleaner signal than the synaptic input from NM.

This  model  has  two  appealing  features:  (1)  The  computation  of  potential  difference  has 
very  few  assumptions.  The  only  unknowns  are  the  magnitude  of  the  external  potential  and  the 
ability  of  the  neuron  to  fire  in  response  to  the  high-frequency  potential  difference.  The  first  of 
these  unknowns  needs  to  be  addressed  experimentally,  the  second  can  be  investigated  through 
simulations.  (2)  The  model  provides  an  explanation  for  the  unusual  appearance  of  NL  neurons  in 
electron  micrographs,  especially  the  lack  of  a  spike-initiating  zone  at  the  beginning  of  the  axon. 
These  observations  were  made  by  Catherine  Carr,  who  suggested  that  spikes  may  be  initiated  at 
the  first  node  in  the  axon,  but  there  was  no  known  reason  for  the  neurons  to  have  this  structure. 

3.  Other  Related  Work 

In addition to his role as consultant to the research effort at Sarnoff, Dr. Sullivan pursued a
number  of  neurocomputational  research  topics  related  to  the  theme  of  this  contract.  The  following 
is  his  report. 

Past work by myself and others had shown that the processing of information about stimulus
timing  and  intensity  are  physically  separated,  and  that  neurons  in  the  brainstem  regions  responsi¬ 
ble  for  these  two  functions  are  anatomically  distinct.  However,  while  we  know  a  lot  about  the 
anatomy  and  physiology  of  neurons  in  both  the  time  and  intensity  pathways,  we  have  a  poor  un¬ 
derstanding  of  the  relationship  between  a  neuron's  anatomical  structure  and  its  physiological 
function.  In  my  work  on  the  auditory  brainstem,  I  have  found  that  questions  of  structure-function 
interrelationships  are  best  approached  in  systems  for  which  the  physiological  function  of  a  partic¬ 
ular  neuron  is  fairly  well  understood.  That  is,  it  is  easier  to  ask  why  a  cell  with  a  specific  process¬ 
ing  function  has  a  particular  anatomical  structure  than  it  is  to  ask  what  the  function  of  a  given  cell 
with a known structure might be. I have used this approach to investigate the possible role of
dendritic processes in neurons that compute horizontal sound localization by measurement of
interaural time differences and to examine what advantages dendrites might provide to neurons
specialized  for  processing  information  about  stimulus  intensity.  In  these  studies  and  others  de¬ 
signed  to  investigate  physiological  mechanisms  of  both  time  and  intensity  processing,  I  have 
come  to  realize  more  clearly  that  the  physiological  mechanisms  available  to  optimize  selectivity 
in  the  time  domain  are  drastically  different  and  often  diametrically  opposed  to  those  that  work 
best  for  intensity.  My  work  is  beginning  to  provide  clear  physiological  explanations  for  the  func¬ 
tional  segregation  that  is  observed  in  the  auditory  system  and  suggests  that  an  understanding  of 
cellular  mechanisms  can  also  help  to  explain  higher  levels  of  neuronal  organization  as  well.  The 
time  and  intensity  segregation  seen  in  the  auditory  system  can  also  provide  insights  into  the  simi¬ 
lar  organization  of  other  sensory  systems  since  for  any  sense  for  which  the  stimulus  is  a  form  of 
energy (e.g., sound, light, touch), both the spatial pattern of energy distribution (i.e., intensity)
across the sensory receptors and the temporal pattern changes in this distribution must be neurally
encoded.


a.  Dendritic  function  in  nucleus  laminaris 

I  have  refined  and  extended  a  model  for  dendritic  function  in  binaural  time  comparison. 
Earlier  theoretical  and  empirical  work  has  established  that  this  computation  involves  a  cellular 
process  called  coincidence  detection  in  which  a  cell's  spike  output  depends  on  its  receiving  at 
least  two  separate,  temporally  synchronized  synaptic  inputs.  Neurons  that  perform  this  task  at 
low  frequencies  have  a  pair  of  long  dendrites,  each  dendrite  being  innervated  by  synaptic  inputs 
derived  from  the  ear  opposite  to  those  impinging  on  the  other  dendrite.  The  modeling  results  sug¬ 
gest  that  these  bipolar  dendrites  enhance  the  cell's  selectivity  for  simultaneous  inputs  impinging 
on  both  dendrites  as  compared  to  coincidences  of  two  inputs  arriving  on  the  same  side.  This  func¬ 
tion  requires  electrical  isolation  between  the  synaptic  inputs  from  the  two  ears  and  therefore  can¬ 
not  be  done  without  dendrites.  However,  the  mechanism  exploits  a  fundamental  property  of 
neuronal  synaptic  transmission  (voltage  saturation)  and  is  therefore  a  general  candidate  for  den¬ 
dritic  functions  involving  sensitivity  or  selectivity  for  specific  spatial  or  temporal  combinations  of 
synaptic  input.  Further  analysis  of  the  model's  predictions  using  more  realistic  periodic  synaptic 
inputs  shows  that  aspects  of  dendritic  morphology  such  as  length,  branching  patterns,  and  number 
can  be  understood  in  the  context  of  the  basic  mechanism  I  am  proposing. 
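
The saturation argument can be illustrated with a toy steady-state calculation. The sketch below is not the model used in this work. It assumes that each dendrite is a single passive compartment with a leak conductance and identical synapses, and that the soma simply adds the depolarizations of the two electrically isolated dendrites. The function names and all parameter values (g_syn, g_leak, e_syn) are hypothetical.

```python
def dendrite_depolarization(n_inputs, g_syn=1.0, g_leak=1.0, e_syn=60.0):
    """Steady-state depolarization (mV above rest) of one passive dendrite.

    Assumed toy model: n_inputs identical synapses, each with conductance
    g_syn and driving force e_syn, in parallel with a leak conductance.
    The response saturates toward e_syn as more synapses are active.
    """
    g_total = n_inputs * g_syn
    return e_syn * g_total / (g_leak + g_total)


def soma_drive(n_left, n_right):
    """Sum of the two dendritic depolarizations (dendrites assumed isolated)."""
    return dendrite_depolarization(n_left) + dendrite_depolarization(n_right)


# One input from each ear (one per dendrite) versus two inputs from one ear.
binaural = soma_drive(1, 1)
monaural = soma_drive(2, 0)
print(binaural, monaural)
```

Under these assumptions, two inputs on opposite dendrites drive the soma harder than two inputs on the same dendrite, because the second input on a shared dendrite sees a reduced driving force.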

b. Comparison of binaural phase processing at high and low frequencies

Anatomical  and  physiological  evidence  indicates  that  the  possible  mechanisms  of  binaural 
time comparison that I have described do not (and in fact cannot) operate in neurons that perform
this task at high frequencies (>5000 cycles/sec) in the barn owl. These cells have no dendrites and
also  have  a  different  axon  morphology.  A  large  portion  of  the  work  was  devoted  to  understanding 
the  physiological  process  that  can  enable  timing  information  to  be  extracted  from  signals  whose 
time  course  is  much  faster  than  what  is  normally  considered  for  neuronal  processes.  My  investi¬ 
gations  show  that  both  the  temporal  properties  of  synaptic  input  (transmitter  release,  post-synaptic 
change  in  electrical  properties)  and  the  mechanisms  of  spike  output  need  to  be  examined  and  that 
with  modest  changes  in  both  of  these  areas,  the  function  of  these  cells  can  be  explained.  Most  re¬ 
cently,  I  have  been  studying  the  relationship  between  the  stochastic  behavior  of  the  action  poten¬ 
tials  in  the  input  neurons  and  the  patterns  of  synaptic  conductance  change  seen  by  the  coincidence 
detector  cells.  These  ongoing  studies  are  providing  some  interesting  and  insightful  results  that 
should  help  to  confirm  the  functional/physiological  dichotomy  between  temporal  and  level  (inten¬ 
sity)  processing  mechanisms  discussed  above. 

c.  Cellular  mechanisms  of  intensity  processing  in  nucleus  angularis 

I  have  applied  a  similar  logic  to  the  one  used  to  investigate  time  comparison  mechanisms  to 
an  analysis  of  dendritic  function  in  the  processing  of  stimulus  intensity.  In  this  case,  I  have  con¬ 
cluded  that  some  of  the  intensity  averaging  functions  that  had  been  thought  to  be  done  by  den¬ 
drites  are  not  likely  to  be  what  dendrites  are  for  since  these  functions  can  be  done  more  efficiently 
in  an  adendritic  cell.  Rather,  I  am  proposing  a  novel  dendritic  function  for  these  cells:  enhance¬ 
ment  of  the  dynamic  range  of  synaptic  strength  between  threshold  and  saturation.  This  and  other 
work  that  I  have  done  suggests  that  the  comparison  of  optimal  morphological  parameters  obtained 
with different assumptions about function is likely to provide a powerful approach to both the
theoretical and empirical investigation of interactions between anatomy and physiology. I have
begun a collaboration with Dr. Catherine Carr at the University of Maryland designed to test some of
the predictions of this work.

Section II


Modeling  Adaptive  Processing  in  the  Visual  Cortex 


A.  BACKGROUND 

Hubel and Wiesel (1959, 1962, 1968), in exploring the visual cortex of the cat and monkey,
found  cells  that  responded  selectively  to  motion  of  a  bar  or  edge  in  a  particular  direction.  Psycho¬ 
physical  evidence  for  similar  direction-selective  mechanisms  in  humans  comes  from  Levinson 
and  Sekuler  (1975)  and  Watson,  Thompson,  Murphy,  and  Nachmias  (1980).  In  these  studies,  con¬ 
trast  thresholds  were  measured  for  the  detectability  of  a  drifting  sine  grating,  as  a  function  of  the 
contrast  of  a  simultaneously  present  grating  component  (the  mask)  drifting  in  the  opposite  direc¬ 
tion.  The  data  showed  that  contrast  thresholds  were  largely  unaffected  by  sub-threshold  mask 
contrasts,  suggesting  the  existence  in  human  vision  of  direction-selective  mechanisms  whose  re¬ 
sponses  are  independent  of  other  concurrently  responding  mechanisms,  at  least  at  low  stimulus 
contrast  levels. 

However,  when  stimulus  contrasts  rise  above  detection  threshold,  this  independence  among 
mechanisms  appears  to  break  down.  Stromeyer,  Kronauer,  Madsen  and  Klein  (1984)  measured 
thresholds  for  changes  in  contrast  of  one  or  both  components  of  a  counterphase  grating;  i.e.,  the 
sum  of  two  gratings  of  equal  high  contrast,  and  equal  spatial  and  temporal  frequencies,  drifting  in 
opposite directions at the same velocity. Their data showed much lower thresholds for changes
involving the simultaneous increment and decrement of the two grating components than for
changes involving increments of both components. This reduced effectiveness of the simultaneous
increment is not predicted by a model in which the mechanisms sensitive to the two stimulus
components  are  responding  independently  of  each  other.  Instead,  it  suggests  an  inhibitory  inter¬ 
action  in  which  each  stimulus  component  is  reducing  the  detectability  of  the  other.  This  inhibito¬ 
ry  interaction  can  be  thought  of  as  a  gain-setting  operation  among  the  cortical  mechanisms 
responding  to  a  particular  visual  signal. 

B.  OBJECTIVES 

The  purpose  of  the  work  performed  under  this  contract  was  to  analyze  quantitatively  the 
form and possible utility of this cortical gain-setting operation. Using the signal domain of
spatiotemporal variations in luminance, we undertook three tasks:

Task  1:  Perform  parametric  psychophysical  measurements  of  contrast  discrimination  among  sim¬ 
ple  motion  stimuli. 

Task 2: Develop a cortical gain-control model that predicts the results of Task 1.

Task  3:  Investigate  signal-enhancing  properties  of  the  model  developed  in  Task  2. 

One result of Task 3, to be discussed below, is that the gain-control model exhibits a
noise-cleaning function for quantum noise in spatio-temporal signals.

C.  RESULTS 

1.  Task  1:  Psychophysical  measurements 

In  all  experiments,  contrast  discrimination  thresholds  were  measured  among  stimuli  com¬ 
posed  of  the  sum  of  a  leftward  and  rightward  drifting  sine  grating  of  equal  spatial  frequency  and 
equal but opposite drift rate. In all cases, the only adjustable stimulus parameters were the
contrasts of the two grating components. Thus, all stimuli can be represented as points in a
two-dimensional space of these two contrast settings, and the psychophysical task can be described in
this same space (call it m2space) as follows: For a given point in m2space (the mask point) and a
given  direction  of  excursion  from  that  point  (the  test  vector  direction),  find  the  magnitude  of  ex¬ 
cursion  for  which  observers  reliably  detect  a  difference  between  the  stimulus  at  the  mask  point 
and  the  stimulus  along  the  test  vector.  For  measurements  along  a  range  of  different  test  vector  di¬ 
rections  from  the  same  mask  point,  the  data  can  be  represented  as  closed  discrimination  contours 
around the mask point, analogous to the MacAdam ellipses of color discrimination measurements.

Fig. II-1 (below) shows some typical discrimination contour results. The solid line contours
show the predictions of the model developed in Task 2, described below. The main point to
observe here about these data is that the sets of threshold points (open circles with error bars) from
each mask point (+ signs) tend to group themselves along rays from the origin of m2space. This
grouping  suggests  a  gain  control  process  in  which  the  outputs  of  mechanisms  sensitive  to  opposite 
directions  of  motion  inhibit  each  other  with  a  division-like  operation.  Assuming  that  each  mecha¬ 
nism  responds  linearly  to  the  contrast  of  its  input  pattern,  the  grouping  occurs  because  all  points 
along  a  single  ray  have  the  same  ratio  of  responses  of  the  two  mechanisms. 
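
A short numerical check of this ray argument, assuming (as stated above) that each mechanism responds linearly to the contrast of its preferred component. The gain values, the chosen ray, and the function name are illustrative assumptions, not fitted parameters.

```python
def response_ratio(c_left, c_right, gain_left=1.0, gain_right=1.0):
    """Ratio of leftward to rightward mechanism responses.

    Each mechanism is assumed to respond linearly to the contrast of its
    own grating component (hypothetical gains).
    """
    return (gain_left * c_left) / (gain_right * c_right)


# Three stimuli on one ray from the origin of the contrast plane:
# the same (leftward, rightward) contrast pair scaled by different factors.
ray = [(0.05 * s, 0.10 * s) for s in (0.5, 1.0, 2.0)]
ratios = [response_ratio(c_l, c_r) for c_l, c_r in ray]
print(ratios)
```

Every point on the ray yields the same ratio, so a division-like interaction cannot distinguish points along a ray by their ratio alone.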


[Figure II-1 plot: horizontal axis is rightward contrast, 0 to 0.2]

Figure II-1. Discrimination contours for observer ARI.

2. Task 2: Cortical gain control model

Fig. II-2 shows a flow diagram for the cortical gain control model. The model consists of four
main  stages: 

1.  A  direction-selective  mechanism  stage,  at  which  the  responses  of  simple  mechanisms  sen¬ 
sitive  to  leftward  and  rightward  drifting  gratings  are  calculated. 

2. A mechanism combination stage, at which the outputs of the direction-selective
mechanisms are combined to produce two opponent and one non-opponent mechanism outputs.

3.  A  transduction  stage,  at  which  each  of  the  three  outputs  from  the  previous  stage  is  passed 
through  its  own  sigmoid  non-linearity. 

4.  A  decision  stage,  at  which  changes  in  the  three  outputs  of  the  transduction  stage  from  the 
mask  to  the  test+mask  stimulus  presentations  are  used  to  choose  which  of  the  two  stimuli 
contained  the  test. 

At  the  core  of  the  model  is  the  opponent  division  operation  indicated  by  the  circular  opera¬ 
tors  in  the  flow  diagram.  This  opponent  division  turns  out  to  have  some  interesting  noise  cleaning 
properties,  as  will  be  discussed  in  the  next  section. 


[Figure II-2 flow chart. Input stimulus: $L_0\,[1 + c_L \cos(2\pi f x + 2\pi\omega t) + c_R \cos(2\pi f x - 2\pi\omega t)]$; the flow ends at the Decision Mechanism.]

Figure II-2. Opponent vision model flow chart

3. Task 3: Signal enhancing properties

Fig.  II-3  shows  how  the  model  developed  to  account  for  the  psychophysical  results  de¬ 
scribed  above  can  clean  noise  from  spatio-temporal  signals.  Panel  A  of  the  figure  shows  two 
frames from a five-frame input sequence. These five frames were constructed from a field of
filtered noise by displacing a central square by one pixel to the right on each frame, while the
background moved to the left by one pixel. Panel B shows the response of the model through the
direction-selective mechanism stage. Note that the square region can now be seen, although it is
partially obscured by noise. Panel C shows the response of the model after the opponent division
operations. Notice that noise is now substantially reduced.


[Figure II-3 panels; the surviving panel title is "C: Opponent response"]

Figure II-3. Noise cleaning by opponent motion model.

This noise-cleaning feature of the opponent division model has been incorporated into a
scheme for moving target indication, as is demonstrated in Fig. II-4. Here, video sequences taken
from an in-flight helicopter scanning the ground have been enhanced to indicate the presence of
vehicles moving along the ground.

Figure II-4. Moving target enhancement. (a) Original. (b) Enhanced with opponent motion operation.

The algorithm for this enhancement operation is as follows:

(1) Stabilize multiple frames from the image sequence to remove camera-induced image motion.

(2) Within a region of interest on the resulting stabilized image sequence, compute motion
energy for leftward and rightward motion ($e_L$ and $e_R$). These energy computations are based on
the outputs of Hilbert pairs of linear, spatiotemporally oriented filters.

(3) Then, at each point in the image plane, compute:

$$\frac{e_L - e_R}{e_L + e_R + k}$$

where k is a small additive constant that prevents division by zero, and also serves to
remove small amounts of image noise. (The results are very insensitive to the exact choice of k,
within a large range.)
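
Step (3) can be sketched as follows, assuming the leftward and rightward motion energies from step (2) are already available as two equally sized 2-D arrays (plain nested lists here). The function name, the example energies, and the value of k are assumptions for illustration; the stabilization and filter stages of steps (1) and (2) are not reproduced.

```python
def opponent_division(e_left, e_right, k=0.01):
    """Pointwise opponent division of two motion-energy maps.

    e_left, e_right: nested lists of non-negative motion energies
    (assumed precomputed). k is the small additive constant; its value
    here is an arbitrary illustrative choice.
    """
    return [
        [(l - r) / (l + r + k) for l, r in zip(row_l, row_r)]
        for row_l, row_r in zip(e_left, e_right)
    ]


# Hypothetical 2x2 energy maps: top row is empty or weakly noisy,
# bottom row holds strong leftward and strong rightward motion.
e_left = [[0.0, 0.002], [0.5, 0.0]]
e_right = [[0.0, 0.0], [0.0, 0.5]]

enhanced = opponent_division(e_left, e_right)
for row in enhanced:
    print(row)
```

In this example the weak, unbalanced energy in the top-right cell stays small because k dominates the denominator, while the strong motion responses in the bottom row approach +1 and -1.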

Notice in Fig. II-4 that in each case the moving vehicle that is nearly invisible in the original
sequence  (Panel  a)  became,  in  the  processed  sequence  (Panel  b),  an  easily  visible  intensity  peak 
that  could  be  input  to  an  automatic  target  tracking  system,  or  used  for  visual  reconnaissance. 

Section III


Hierarchical  Architectures  and  Integration  of  Neural  Networks  and 
Knowledge-Based Systems for Intelligent Robotic Control


A.  BACKGROUND 

Most  conventional  robotic  systems  operate  in  structured  environments  and  are  quite  un¬ 
skilled  by  human  standards,  having  trouble  with  such  seemingly  simple  factory  assembly  tasks  as 
picking  a  part  from  a  bin  or  threading  a  nut  on  a  bolt.  One  important  ingredient  lacking  in  these 
approaches  is  the  system's  ability  to  acquire  sensorimotor  skills  through  learning  and  practice. 

One popular model of human skill acquisition [1] defines three phases of learning: (1) the
cognitive  phase,  wherein  a  beginner  tries  to  understand  the  task,  (2)  the  associative  phase,  in 
which  patterns  of  response  emerge  and  gross  errors  are  eliminated,  and  (3)  the  autonomous  phase, 
when task execution requires little cognitive control. Knowledge can be represented in this skill
acquisition  model  using  declarative  and  reflexive  forms  of  memory  and  learning  [2].  Motions  in¬ 
dicative  of  declarative  memory  and  learning  require  conscious  effort,  are  characterized  by  infer¬ 
ence,  comparison,  and  evaluation,  and  provide  insight  into  not  only  how  something  is  done,  but 
why.  Motions  involving  reflexive  mechanisms  relate  specific  responses  to  specific  stimuli,  are  au¬ 
tomatic,  and  require  little  or  no  thought. 

Tasks  initially  learned  declaratively  often  become  reflexive  through  repetition.  Conversely, 
when  familiar  tasks  are  attempted  in  novel  situations,  reflexive  knowledge  often  must  be  convert¬ 
ed  back  into  declarative  form  in  order  to  become  useful.  Furthermore,  shifts  from  declarative  to 
reflexive  forms  of  control  often  are  accompanied  by  a  reduced  dependence  on  sensory  informa¬ 
tion  [3,4],  implying  the  utilization  of  learned  predictive  models  of  one's  behavior  and  environ¬ 
ment.  This  learning  and  shifting  of  task-specific  knowledge  between  declarative  and  reflexive 
forms  of  memory  plays  a  fundamental  role  in  human  skill  acquisition,  affecting  computational  re¬ 
source  allocation,  the  focusing  of  attention,  and  the  ability  to  adapt. 

In  addition,  humans  typically  rely  upon  visual  information  for  motor  control,  but  can  with 
practice  switch  to  proprioceptive  control  of  motion  [5-7].  This  ability  is  particularly  useful  be¬ 
cause  vision  is  so  effective  for  monitoring  the  environment  and  planning  motion.  For  example,  in 
sports,  a  novice  must  devote  a  great  deal  of  visual  attention  to  the  control  of  his  or  her  limbs  and 
the execution of those tasks necessary for play. This restricts the visual resources available for
monitoring the opponent or field position. On the other hand, an expert has learned, through
practice, motor programs that rely for their execution predominantly upon kinesthetic input from
limbs  and  muscles  —  leaving  vision  free  to  attend  to  the  other  aspects  of  the  game  [8]. 

B.  OBJECTIVES 

The goal of this project was to develop intelligent sensorimotor control systems that
integrate declarative and reflexive forms of processing and multisensory inputs within biologically
inspired control hierarchies to enable high levels of robotic dexterity and adaptability. These
approaches are to be tested on a high-degree-of-freedom robot. The following is a summary of
the Statement of Work contained in the project proposal:

Task  1:  Investigate  models  of  the  structural,  functional,  and  behavioral  aspects  of  human  motor 
control  and  skill  acquisition  for  the  development  of  hierarchical  processing  architectures 
for intelligent robotic control. Investigate how these biological paradigms might be used
to  integrate  knowledge-based  systems  and  neural  networks  for  robotic  skill  acquisition. 

Task  2:  Develop  advanced  neural  network  modules  for  use  in  trajectory  generation,  reflex  gain 
modulation,  and  inverse  kinematic  and  dynamic  transformations. 

Task 3: Investigate how the resultant control technique might be applied to complex dynamic
systems, including a high-degree-of-freedom robotic limb with low-level reflexes utilizing
muscle-like  pneumatic  actuators.  Investigate  system  performance  associated  with  learn¬ 
ing  reflex  gain  modulations  and  set-point  adjustments  for  functional  motor  control  tasks. 

C.  RESULTS 
1.  Task  1 

a.  Intelligent  control  architecture 

An  architecture  for  intelligent  sensorimotor  control  has  been  developed  that  emulates  phases 
of  human  motor  skill  acquisition  by  integrating  knowledge-based  systems  and  neural  networks. 
The robotic skill acquisition architecture, depicted in Fig. III-1 (below), is modelled after the
major structural and functional features of the human motor control system.

Three main levels of the control hierarchy are defined: joint reflexes, motor synergies, and
task-level  execution  components.  On  the  lowest  level,  each  joint  is  endowed  with  reflexes  (servo¬ 
mechanisms)  that  command  torque  as  a  function  of  sensed  joint  position,  velocity  and  torque,  and 
desired  joint  position,  velocity,  and  acceleration.  The  reflexes  are  coordinated  by  motor  syner¬ 
gies,  the  “spinal  cord”  of  this  architecture,  so  as  to  limit  the  number  of  degrees-of-freedom  that 
actually  need  to  be  controlled  by  higher  levels  of  the  system.  Task-level  execution  components 
invoke  motor  synergies  most  appropriate  to  the  task  at  hand,  then  set  the  system  in  motion  by  pro¬ 
viding  the  reflex  loops  and  active  motor  synergies  with  time  varying  gains  and  commands  needed 
to  perform  maneuvers.  Motor  synergy  modulation  and  command  generation  are  carried  out  by 
both  rule-based  and  neural  network-based  task  execution  components,  sometimes  jointly,  some¬ 
times  independently,  with  the  execution  monitor  supervising  various  phases  of  learning  through 
the manipulation of the mixer in Fig. III-1. The execution monitor continuously evaluates neural
network  task  execution  performance  and  re-engages  rule-based  components  whenever  errors  due 
to  changes  in  the  dynamic  system  or  its  operating  environment  necessitate  retraining  of  a  network. 
The rule-based subsystems thereby ensure proper task completion while neural network
re-learning takes place. The manner in which rules and networks interact during the various phases of
learning  and  supervision  gives  this  control  architecture  unique  adaptive  capabilities. 

b.  Robotic  skill  acquisition 

In  an  attempt  to  capture  behavioral  features  of  human-to-human  skill  transfer,  along  with 
their  implications  regarding  the  management  of  attention  and  computation,  our  approach  to  robot¬ 
ic  skill  acquisition  incorporates  both  declarative  and  reflexive  forms  of  processing.  This  architec¬ 
ture  utilizes  transitions  between  declarative  knowledge-based  systems  and  reflexive  neural 
networks  to  enable  system  adaptation  and  optimization.  The  control  scheme  attempts  to  parallel 
the  training  of  an  athlete  (the  robot)  by  a  coach  (the  designer),  whereby  the  robot  learns  through 
experience  how  to  perfect  tasks  initially  specified  in  a  high-level  task  language.  Rule-based  sys¬ 
tem  components  encode  neural  network  learning  strategies,  and  skill  acquisition  is  associated 
with  the  shift  from  a  predominantly  feedback-oriented,  rule-based  representation  of  control  to  a 
predominantly  feedforward,  network-based  form. 

In  this  case,  the  acquisition  of  skill  is  meant  to  imply  a  dramatic  improvement  in  task  perfor¬ 
mance  over  time,  as  well  as  a  significant  decrease  in  the  amount  of  computation  required  to  obtain 
this performance. A reduced computational burden is desired in order to mitigate the usually
adverse effects of scaling up a problem, such as an explosive growth in the number of rules or
execution time required to handle an increasingly complex control problem. Fig. III-2 depicts how
this reduction in computation, and hence execution time, is achieved here. Initially, explicit
control strategies are conveniently represented by a hierarchical knowledge base. As the system
operates, the input/output relationships encoded by inferencing rule-based resources are smoothly
transformed into implicit neural network mappings.

Figure III-2. Explicit and implicit functional dependencies provided by rules and neural networks within an
RSA2 controller.

Analogies  to  models  of  human  motor  skill  acquisition  are  used  to  define  transitions  between 
declarative  and  reflexive  modes  of  operation.  The  various  phases  of  robotic  skill  acquisition  are 
depicted  in  Fig.  III-3.  During  the  declarative  phase,  knowledge-based  system  components  dis¬ 
cover  how  to  achieve  rough-cut  task  performance.  Rules  and  conventional  control  algorithms 
provide  for  plan  specification,  desired  trajectory  generation,  and  error-driven  control  commands. 
During  the  hybrid  phase,  neural  networks  learn  by  knowledge-based  example  how  to  accomplish 
parts of the task. Knowledge-based and neural-network-based components share control
responsibility, with relatively poor initial network performance giving way to robust patterns of learned
response.  Finally,  during  the  reflexive  phase  of  skill  acquisition,  certain  functions  previously  pro¬ 
vided  by  inferencing  rule-based  resources  are  now  provided  by  memory-intensive  neural  net¬ 
works.  If  desired,  associated  rules  are  conditionally  removed  from  the  decision-making  process. 
When  applicable,  reflexive  neural  network-based  control  is  optimized  through  reinforcement 
learning. 

(a) Declarative Phase: plan specification, desired trajectory generation, and error-driven
control commands provided by rule-based system components.

(b) Hybrid Phase: neural networks contribute to, and learn from,
rule-based control commands.

(c) Reflexive Phase: functions previously provided by inferencing rule-based resources
now provided by memory-intensive neural networks.

Figure III-3. Shift in representation of control law during phases of robotic skill acquisition.

Improvements in system performance during the transition from declarative to reflexive
operation are accomplished using the neural network training paradigm of feedback-error-learning
[9]. In feedback-error-learning, the total control command is the algebraic sum of two
components: (1) an error-driven feedback component that ensures reasonable, yet improvable, system
behavior, and (2) a neural network-based component that initially contributes nothing, but learns
over time to compensate for the inadequacy of the feedback component:

Control Command = Feedback Component + Network Component    (1)


In  an  RSA2  controller,  the  feedback  component  is  knowledge-based,  utilizing  rules  and  con¬ 
ventional  control  algorithms  to  embed  as  much  knowledge  about  successful  control  strategies  as 
possible (or practical). During the Hybrid Phase of skill acquisition (Fig. III-3), the goal of neural
network  training  is  to  minimize  over  time  this  feedback  component's  contribution  to  the  control 
command,  and  thereby  drive  it  to  zero.  Consequently,  the  feedback  component's  corrective  ac¬ 
tions,  driven  by  discrepancies  between  desired  and  actual  (measured  or  estimated)  trajectories, 
not  only  serve  as  part  of  the  control  law,  but  also  serve  as  neural  network  weight  update  errors. 
Given  adequate  feedback  control  suggestions  and  reasonable  learning  rates,  the  network  compo¬ 
nent  will  learn  the  inverse  dynamics  of  the  system  being  controlled,  in  the  sense  that  it  can  recall 
the  control  command  required  for  a  desired  change  in  system  output. 
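
The update rule described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the RSA2 controller: the plant is a hypothetical first-order linear system, a simple proportional controller stands in for the knowledge-based feedback component, and a two-weight linear map stands in for the neural network. All names and parameter values are invented for the example.

```python
import math


def run_feedback_error_learning(steps=10000, dt=0.01, a=1.0, b=2.0,
                                kp=5.0, lr=2.0):
    """Feedback-error-learning on an assumed plant x' = -a*x + b*u."""
    w = [0.0, 0.0]      # network weights; the network initially contributes nothing
    x = 0.0             # plant state
    fb_history = []     # magnitude of the feedback component over time
    for t in range(steps):
        xd = math.sin(t * dt)                 # desired trajectory
        xd_next = math.sin((t + 1) * dt)
        xd_dot = (xd_next - xd) / dt          # desired rate of change
        features = [xd, xd_dot]

        u_fb = kp * (xd - x)                  # error-driven feedback component
        u_nn = w[0] * features[0] + w[1] * features[1]  # network component
        u = u_fb + u_nn                       # total command, as in Eq. (1)

        # The feedback command itself serves as the weight update error.
        w = [wi + lr * dt * u_fb * fi for wi, fi in zip(w, features)]

        x += dt * (-a * x + b * u)            # Euler step of the assumed plant
        fb_history.append(abs(u_fb))
    return w, fb_history


weights, fb = run_feedback_error_learning()
early_effort = sum(fb[:500]) / 500
late_effort = sum(fb[-500:]) / 500
print(weights, early_effort, late_effort)
```

For this plant the inverse dynamics are u = (a*x_d + x_d')/b, so the weights should approach a/b and 1/b as the feedback contribution shrinks toward zero.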

The learning philosophy embodied by feedback-error-learning, used here in conjunction
with  knowledge-based  systems  and  neural  networks,  permits  analogies  to  be  drawn  to  certain  as¬ 
pects  of  human  skill  acquisition.  A  limited  amount  of  strategic  knowledge  initiates  motion.  Per¬ 
formance  improves  incrementally  through  learning,  with  inferential  problem  solving  giving  way 
to  reflexive  motor  programs.  Computational  efficiency  is  intended  for  areas  of  the  dynamic  state- 
space  visited  often,  as  in  repetitive  maneuvers.  Inferential  problem  solving  remains  ready,  how¬ 
ever,  to  handle  infrequently  executed  tasks,  or  changes  in  the  robot  or  its  environment  that  render 
previously  acquired  expertise  ineffective.  The  RSA2  control  technique  provides  a  way  to  com¬ 
bine  rules,  neural  networks,  and  feedback-error-leaming  to  enable  such  adaptive  behavior. 

2.  Task  2 

a.  Neural  network  architectures 

The type of neural network architecture used in a control problem has a major impact on
system learning and performance. It is well known that many biological sensorimotor control
structures in the brain are organized using neurons that possess locally tuned overlapping
receptive fields. The main benefits of using local approximation techniques to represent nonlinear
system functions are faster learning, compared to the global approaches, and the ability to train the
network in one part of the input space without corrupting what has already been learned at more
distant points in the input space.
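
The locality property described above can be illustrated with a minimal one-dimensional CMAC-style approximator. This is a sketch under assumptions, not the networks used in this work: it uses binary (on/off) overlapping receptive fields, an input range of [0, 1], and hypothetical parameter names and values.

```python
import math

class TinyCMAC:
    """1-D approximator with overlapping binary receptive fields (tile coding)."""

    def __init__(self, n_tilings=8, tile_width=0.25):
        self.n_tilings = n_tilings
        self.tile_width = tile_width
        n_tiles = int(math.ceil(1.0 / tile_width)) + 1   # covers inputs in [0, 1]
        self.weights = [[0.0] * n_tiles for _ in range(n_tilings)]

    def _active(self, x):
        # Each tiling is shifted by a fraction of the tile width, so an input
        # activates exactly one receptive field per tiling.
        for i in range(self.n_tilings):
            offset = i * self.tile_width / self.n_tilings
            yield i, int((x + offset) / self.tile_width)

    def predict(self, x):
        return sum(self.weights[i][j] for i, j in self._active(x))

    def train(self, x, target, rate=1.0):
        # Spread the correction equally over the active receptive fields only.
        error = target - self.predict(x)
        for i, j in self._active(x):
            self.weights[i][j] += rate * error / self.n_tilings

if __name__ == "__main__":
    net = TinyCMAC()
    net.train(0.2, 1.0)
    print(net.predict(0.2), net.predict(0.22), net.predict(0.8))
```

Training at x = 0.2 changes the output at nearby inputs, which share most of the same receptive fields, while the output at x = 0.8 is untouched because no receptive field is shared.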

Our approach to neural network-based control utilizes network architectures suitable for
on-line learning. Our recent work has indicated that shifts in control between the rule-based and
neural network components of Fig. III-2 can be accomplished on-line using the fast learning
capabilities of CMAC neural networks. It was shown that the use of B-Spline receptive field functions
enables higher-order CMAC neural networks [10] to be constructed that can learn both functions
and function derivatives. This ability, coupled with the computational efficiency of CMAC neural
networks, allows on-line construction of multi-dimensional functions and their Jacobian matrices
for use in inverse kinematics and dynamics transformations and reinforcement learning.
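
One way to see why smooth receptive fields give access to derivatives is the following sketch. The notation is ours and is an assumption, not taken from [10]: w_i are the network weights and B_i are the receptive field functions active at input x.

```latex
y(\mathbf{x}) = \sum_{i} w_i \, B_i(\mathbf{x}),
\qquad
\frac{\partial y}{\partial x_j} = \sum_{i} w_i \, \frac{\partial B_i(\mathbf{x})}{\partial x_j}
```

With binary receptive fields, each B_i is piecewise constant and the derivative carries no useful information. With differentiable B-Spline receptive fields, the same weights that represent the function also yield its partial derivatives, and stacking these over the network outputs gives a Jacobian matrix.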

Multi-layer perceptron neural networks [11] are often slow to learn nonlinear functions with
complex local structure due to the global nature of their function approximations. Neural
networks based on local approximations, such as CMACs, are capable of learning nonlinear
functions with localized structure quickly, but may generalize poorly and can require basis set sizes
that scale exponentially with the dimension of the input space. We have shown that CMAC
neural networks with B-Spline receptive field functions can be incorporated into the node connection
functions computed in multi-layer perceptrons [12]. This allows SplineNet architectures to be
developed that are also suitable for on-line learning by combining the generalization capabilities
and scaling properties of global multi-layer feedforward networks with the computational
efficiency and learning speed of local network paradigms.

b.  Neural  network  training  paradigms 

In our approach to intelligent robotic control, many of the system components in Fig. II-3 are
decomposed into feedback and neural network subsystems as shown in Fig. III-4. The outputs of
the feedback and neural network subsystems, $u^{fb}$ and $u^{nn}$, respectively, are summed to obtain the
total control command, $u$. $x^{d}$ is the vector of desired states, and $x$ is the vector of actual state
values. As the neural network-based subsystem is trained, the feedback control law contribution, $u^{fb}$,
is driven to zero, and in the process, the neural network-based component learns the inverse
dynamics of the system (in the sense that it can compute the required control command for a desired
change in the system output). When the desired state values, $x^{d}$, are used as network inputs
instead of actual state values, $x$, system operation can be smoothly shifted from a predominantly
feedback form of control to a predominantly feedforward form. The neural network component, $u^{nn}$,
is trained using a quadratic cost function of $u^{fb}$ (or a differenced version, $[u^{fb}]_{k} - [u^{fb}]_{k-1}$) to
minimize the feedback component's contribution to the total control command, $u$. The $k$ subscript
represents the value of the control variables at time $t_k$. This learning paradigm is commonly
referred to in the literature as Feedback-Error-Learning [10].

Reinforcement learning optimization is
used  to  refine  “rough-cut”  task  execution  produced  by  constant  coefficient  motor  synergies  and 
primitive  (untrained)  joint  servo-reflexes.  This  is  done  by  first  training  the  neural  networks  at  the 
joint  reflex  level  to  represent  the  inverse  dynamics  of  the  system.  When  this  phase  of  learning  is 
completed,  central  pattern  generator  (CPG)  neural  networks  [13,14]  residing  at  the  motor  synergy 
level  of  the  controller  are  trained  to  modulate  the  synergy  strength  coefficients  as  a  function  of 
time  and/or  the  desired  system  state  in  order  to  minimize  the  amount  of  energy  expended  or  the 
rate-of-change  of  torque  applied.  Minimization  of  the  quadratic  cost  functions  associated  with 
energy  or  rate-of-change  of  torque  requires  solving  a  two-point  boundary  value  problem  due  to 
the  split  boundary  conditions  on  the  state  and  adjoint  equations.  As  a  result,  the  reinforcement 
learning  optimization  utilizes  a  sweep  method  [15]  to  iteratively  solve  for  the  optimal  synergy 
strengths.  The  forward  sweep  is  accomplished  as  a  maneuver  is  performed  by  storing  a  trace  of 
the  relevant  system  variables  in  a  short-term  memory  buffer.  Once  it  has  been  determined  that  the 
maneuver has ended, the system “thinks” about what it has just done by sweeping the adjoint
system of equations backwards in time in order to make the next repetition of the maneuver closer to
optimal. Since derivatives of the neural network-based joint reflexes are required in the
reinforcement learning optimization, differentiable neural networks capable of local function
approximation such as SplineNets [12] or BMACs [9] are used. The advantage of using recurrent networks
such  as  Jordan  Nets  [14]  to  implement  central  pattern  generators  is  that  the  synergy  strengths  can 
be  modulated  as  either  periodic  or  non-periodic  functions  of  time.  If  feedforward  networks  such 
as BMACs [9] or SplineNets are used, then the synergy strength coefficients will be modulated in a
chained  response  fashion  [13,16,17]  based  on  the  current  value  of  the  manipulator  state.  In  either 
case,  the  use  of  central  pattern  generator  neural  networks  allows  the  optimal  maneuvers  learned 
through  practice  to  be  generalized  across  space  and  time. 
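
The forward and backward sweeps described above can be sketched for a toy problem. This is an illustration under assumptions, not the optimization used in this work: the scalar linear plant, the cost weights, the gradient step size, and all names are hypothetical, and the control sequence is adjusted directly rather than through synergy strength coefficients.

```python
def sweep_optimize(n_steps=20, repetitions=50, step=0.4):
    """Iterative forward/backward sweeps on x[k+1] = a*x[k] + b*u[k]."""
    a, b = 0.95, 0.1          # assumed plant
    q, r = 10.0, 0.01         # assumed weights: terminal error and control effort
    target = 1.0
    u = [0.0] * n_steps       # control sequence refined from one repetition to the next
    costs = []
    for _ in range(repetitions):
        # Forward sweep: perform the maneuver and store a trace of the state.
        # (This linear toy only needs the final state afterwards; a nonlinear
        # plant would use the whole trace to linearize at each step.)
        x = [0.0]
        for k in range(n_steps):
            x.append(a * x[k] + b * u[k])
        effort = sum(uk * uk for uk in u)
        costs.append(0.5 * q * (x[-1] - target) ** 2 + 0.5 * r * effort)
        # Backward sweep: the adjoint variable starts from its terminal
        # condition and is propagated backwards in time.
        lam = q * (x[-1] - target)
        grad = [0.0] * n_steps
        for k in reversed(range(n_steps)):
            grad[k] = r * u[k] + b * lam    # dJ/du[k], using the adjoint at k+1
            lam = a * lam                   # adjoint at k
        # Adjust the controls so the next repetition is closer to optimal.
        u = [uk - step * gk for uk, gk in zip(u, grad)]
    return u, costs

if __name__ == "__main__":
    controls, costs = sweep_optimize()
    print("cost on first and last repetition:", costs[0], costs[-1])
```

The state is fixed at the start of the maneuver and the adjoint at the end, which is the split in boundary conditions that makes an iterative sweep necessary.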

c.  Learning  automatic  behaviors  in  multi-sensory  robotic  systems 

One  example  of  a  multi-sensory  integrated  approach  to  robotic  systems  is  presented  in  this 
report.  A  more  complete  discussion  of  these  systems  and  other  examples  are  given  in  Gelfand  et 
al.  [18].  Automatic  control  of  a  sensory-motor  task  is  acquired  through  practice  in  an  integrated 
system  that  uses  visual  input  to  execute  a  task  and  train  a  control  system  to  perform  that  task  using 
sensory inputs from joint position sensors.

Figure III-5. A schematic diagram of a hybrid learning and control system. This system plans and executes the
motion of an arm using visual input and trains the arm to perform the task using feedback from position sensors
in the actuators.


A schematic diagram of a multisensory learning and control system is shown in Fig. III-5. A
simulated robot manipulator performs a task with a machine vision system initially determining
the  appropriate  trajectory  of  the  manipulator  based  on  relevant  information  about  the  work  space. 
This visual information is fed to the modules marked visual planner and visual control. The
visual control module uses the visual feedback of the position of the arm to execute movement along
the planned path. During the execution of this visually guided motion, proprioceptive sensors
provide information about the arm's state to a CMAC neural network [19]. This network is
trained to provide the proper control outputs to cause the arm to move in the same path as under
visual system control. This CMAC controller provides a direct coupling from proprioceptive
input to motor output for only that portion of trajectory space in which the response was learned in
performing the task. The process described above is supervised by an execution monitor
responsible for monitoring the performance of the kinesthetic control system relative to the visually
controlled system and for switching control between them. The execution monitor also monitors the
gross performance of the system. If problems are encountered, such as an unexpected collision,
control may be switched back to the visual system, which, with its visual sensing of the whole
workspace and general algorithmic controller, allows for comprehensive diagnostics and possible
retraining.

d.  Learning  control  of  an  arm  in  the  presence  of  an  obstacle 

In this demonstration, we use a visual system to locate an object in two-dimensional space
and to control the motion of the two-link manipulator. As shown in Figs. III-6a and 6b, a CMAC
was trained to control the position of the manipulator as a function of measured joint angles.
During the training passes, the RMS difference between the visually controlled manipulator position
and the position suggested by the CMAC is monitored and used to determine when the CMAC
has adequately learned the desired trajectory. The execution monitor then switches control from
visually guided motion to kinesthetically controlled motion.

Referring to Fig. III-5, we see a two-link manipulator constrained to a horizontal plane. The
arrangement of the manipulator, the object, and the visual system is shown. For the sake of this
demonstration we used a simple binocular visual system that locates the object in space using the
angles from the object to the sensors. The path was calculated by first determining a point of
closest allowable approach based on the size of the end effector. This point and the given initial and
final end effector positions were used to compute a spline function representing a desired
trajectory. The visual system monitors the position of the end effector as the motion is controlled by
torques calculated by the inverse dynamics of the arm. As the arm moves, the CMAC is given as
input the current joint angles, joint velocities, and desired joint angles at the end of the segment.
The CMAC is trained to output the required torques at each joint to produce the desired end
effector trajectory. The training consists of comparing the torque output of the inverse dynamic
controller with that of the CMAC and training the weights by the standard CMAC learning algorithm
[19]. When the error falls below a predetermined level, control is switched from visual input
based on end effector position to the CMAC.
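
The hand-over logic described above can be sketched as follows. This is a simplified illustration under assumptions, not the system used in the demonstration: the teacher commands stand in for the torques of the visually guided inverse dynamics controller, the learner is a plain lookup table rather than a CMAC, and the class name, threshold, and learning rate are hypothetical.

```python
import math

class ExecutionMonitor:
    """Switches control to the learned controller once it tracks the teacher."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.mode = "visual"              # start under visual control

    def end_of_sweep(self, teacher_cmds, learner_cmds):
        # RMS difference between the two controllers over one complete sweep.
        n = len(teacher_cmds)
        rms = math.sqrt(sum((t - l) ** 2
                            for t, l in zip(teacher_cmds, learner_cmds)) / n)
        if self.mode == "visual" and rms < self.threshold:
            self.mode = "kinesthetic"     # hand over to the learned controller
        return rms

    def report_fault(self):
        # For example, an unexpected collision: fall back to visual control.
        self.mode = "visual"

def train_until_switch(max_sweeps=20, rate=0.5, threshold=0.05):
    teacher = [math.sin(0.1 * k) for k in range(50)]   # stand-in teacher commands
    learner = [0.0] * len(teacher)                     # stand-in learned commands
    monitor = ExecutionMonitor(threshold)
    history = []
    for _ in range(max_sweeps):
        history.append(monitor.end_of_sweep(teacher, learner))
        if monitor.mode == "kinesthetic":
            break
        # Supervised update toward the teacher's command at each sample.
        learner = [l + rate * (t - l) for t, l in zip(teacher, learner)]
    return monitor, history

if __name__ == "__main__":
    monitor, history = train_until_switch()
    print(monitor.mode, ["%.3f" % h for h in history])
```

The RMS difference falls with each training sweep, and control is transferred only once it drops below the chosen level; a reported fault returns control to the visual system.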

The results of this demonstration are shown in Figs. III-6a and 6b. These figures depict the
behavior of the system after the indicated number of runs. Each training run consists of a
complete sweep of the trajectory from the initial position to the final position. In each figure, we use a
thin line to indicate the actual trajectory of the end effector as controlled by the visual input
controller. The heavy lines indicate the motion that would result from using the commands from the
CMAC controller. At the bottom of each figure, we show the RMS differences of the joint angles
between the CMAC-controlled and visually controlled trajectories plotted against the number of
training runs. In Fig. III-6a, the lines from the robot's binocular visual sensors to the end effector
indicate that the system is under visual control. We can see that the output of the CMAC begins to
approach the desired path. The RMS difference becomes smaller and the trajectories depicted by
the thin and heavy lines become coincident. In Fig. III-6b, we show the performance of the
system after control has been transferred to the CMAC.

Figure III-6a and 6b. View from above of the robotic arm under visual control training a CMAC neural network to
execute the same trajectory using joint angle feedback. The graph at the bottom of each figure depicts the RMS
difference between the visual and CMAC control as discussed in the text. In (a), the arm is in its seventh sweep
and control remains under the visual system. In (b), control of the arm has been transferred to the CMAC.


3.  Task  3 

Our hybrid rule-based/neural network control technique was initially used to integrate
knowledge-based system and neural network techniques for the control of a two-link manipulator.
A simulation was constructed in which a neural network learned how to perform a tennis-like
manipulator swing. The control system utilized rule-based components to initiate the swinging
maneuver and train the neural network. It was shown that for the manipulative task investigated,
shifts between declarative and reflexive processing occurred smoothly, with no stability problems,
and could be traced by variations in the number of rules being tested and the network output
errors during learning. Additionally, real-time performance on economical hardware was indicated.

This control approach has subsequently been applied successfully (in simulation) to the
control of the redundant six-link anthropomorphic robot shown in Fig. III-7 [21].


Figure III-7. SLIM, a planar, six-link, five-joint robot that “stands” six feet tall.


In addition to simulation, we have also begun to implement aspects of the hybrid
architecture on a physical version of SLIM (Skill Learning Intelligent Manipulator). Here, we describe
the hardware and its current level of functioning. It is important to recognize that the hardware
lags our simulations because of the non-ideal behavior of the structure and the actuators.

SLIM is a planar, six-link, five-joint robot that “stands” roughly six feet tall. The robot
looks like a person in profile. It is made of light-weight aluminum I-beams that are hinged at the
joints using ball bearings. Each joint is controlled by a pair of soft pneumatic actuators known as
rubbertuators. These rubbertuators are antagonistically arranged and each can develop about 350
lbs. of force. The rubbertuators are made by the Bridgestone Corporation of Japan. The artificial
muscles are driven by proportional control valves that servo the pressure in each muscle to a value
determined by the control software. The force in each muscle is sensed by a load cell, and the
joint angles are sensed by linear potentiometers.

Two IBM-compatible 386/33 MHz computers are used to implement the control algorithm
and display the state of the manipulator. The two computers are linked by a high-bandwidth
fiber-optic data link. One computer uses PID control based on angle to get SLIM to achieve a desired
posture as determined by a program running on the other computer. The posture-determining
algorithm is a modified Berkinblit approach that is a forward approximation solution to the inverse
kinematics of the link-redundant robot [22]. The algorithm is modified to improve convergence
by inclusion of low-level muscle synergies (reflexes) that allow the coordinated withdrawal or
extension of the arm or leg. Without these synergies, each joint acts separately, and motions such as
limb extension or withdrawal proceed slowly. With these elements (posture controller and joint
angle PID controller), we have gotten SLIM to stand and exercise free motion by tracking his
end-point (end of arm) along a line of arbitrary inclination, or along a circle.
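
To illustrate what an iterative, forward-computed solution to the inverse kinematics of a link-redundant planar arm can look like, here is a generic Jacobian-transpose iteration. It is swapped in for illustration only and is an assumption on our part: it is not the modified algorithm of [22], it omits the muscle synergies described above, and the link lengths, gain, and function names are hypothetical.

```python
import math

def joint_positions(lengths, angles):
    """Positions of every joint and the end-point; angles are relative."""
    px, py, total = [0.0], [0.0], 0.0
    for length, theta in zip(lengths, angles):
        total += theta
        px.append(px[-1] + length * math.cos(total))
        py.append(py[-1] + length * math.sin(total))
    return px, py

def solve_posture(lengths, angles, target, gain=0.05, iterations=2000):
    """Adjust the joint angles so that the end-point approaches the target."""
    angles = list(angles)
    for _ in range(iterations):
        px, py = joint_positions(lengths, angles)
        ex, ey = target[0] - px[-1], target[1] - py[-1]   # end-point error
        for i in range(len(angles)):
            # Rotating joint i moves the end-point perpendicular to the
            # lever arm from joint i to the end-point.
            jx = -(py[-1] - py[i])
            jy = px[-1] - px[i]
            # Each joint turns in proportion to how much its rotation
            # reduces the error, using only forward kinematics.
            angles[i] += gain * (jx * ex + jy * ey)
    return angles

if __name__ == "__main__":
    lengths = [1.0, 1.0, 1.0]
    posture = solve_posture(lengths, [0.3, 0.3, 0.3], (1.0, 1.5))
    px, py = joint_positions(lengths, posture)
    print("end-point:", px[-1], py[-1])
```

Because the arm has more joints than the task has dimensions, many postures reach the same end-point; this iteration simply settles on one that is reachable from the starting posture.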

We have also added active joint compliance control so that any joint of the robot can be
stiffened or loosened at will. Recall that SLIM is a compliant structure, so that, in response to an
external push, he will give way to a degree determined by the joint compliance. Recently, we have
added a CMAC network to improve the tracking ability of the PID controller for one joint of the
robot. The CMAC is necessary because the rubbertuators are non-linear, so that PID gains must
vary for different postures to obtain optimal and stable performance. The CMAC feed-forward
controller learns to compensate so that knee position error is reduced. In essence, the CMAC is
learning the inverse dynamics about a given joint angle and the information is used to create a
better controller. The learning method here is due to Kawato, and it trains the CMAC to drive the
feedback component to zero. We are presently generalizing this result to other joints of the robot.

To deal with computing limitations, we have also developed a new Digital Signal Processor
(DSP) architecture and have begun to implement it on SLIM. The architecture involves the use of
both shared and global memory and has a low-speed ascending and descending data bus in
analogy with the ascending and descending pathway in the human spinal cord.